ShARe/CLEF eHealth 2013 Normalization of Acronyms/Abbreviations Challenge
Authors
Abstract
Objective: Abbreviations and acronyms are widely used in clinical documents. This paper describes the use of a machine learner to automatically extract spans of abbreviations and acronyms from clinical notes and map them to UMLS (Unified Medical Language System) CUIs (Concept Unique Identifiers).
Tasks: A Conditional Random Field (CRF) machine learner was used to identify abbreviations and acronyms. First, the training data were converted to the CRF format, and different feature sets were evaluated with 10-fold cross-validation to find the best feature set for building the machine learning model. Second, the identified abbreviation/acronym spans were mapped to UMLS CUIs. Third, a rule-based engine was applied to disambiguate terms with multiple abbreviations or acronyms.
Approach: A novel supervised learning model was developed that combines a machine learning algorithm with a rule-based engine. Each step was evaluated using precision, recall and F-score for span detection, and accuracy for CUI mapping.
Resources: Several tools created in our laboratory were used, including a Text to SNOMED CT (TTSCT) service, a Lexical Management System (LMS) and a ring-fencing approach. A set of gazetteers created from the training data was also employed.
Results: Ten-fold cross-validation on the training data gave a precision of 0.911, a recall of 0.887 and an F-score of 0.899 for detecting abbreviation/acronym boundaries, and an accuracy of 0.760 for CUI mapping. The official results on the test data showed a strict accuracy of 0.447 and a relaxed accuracy of 0.488, ranking third among the five participating teams. A supervised machine learning method with mixed computational strategies, combined with a rule-based method for disambiguating expansions, seems to provide a near-optimal strategy for automated extraction of abbreviations/acronyms.
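To make two steps of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "CRF format" means one token per row with a B/I/O label, which is a common convention but is not stated in the abstract. The function names `to_bio` and `f_score`, the label `ABBR`, and the example sentence are all hypothetical. The second function shows that the reported F-score is the harmonic mean of the reported precision and recall.

```python
import re


def to_bio(text, spans):
    """Label whitespace-delimited tokens with B/I/O tags.

    `spans` is a list of (start, end) character offsets marking annotated
    abbreviations/acronyms (end exclusive). The tag name "ABBR" is an
    illustrative assumption, not taken from the paper.
    """
    rows = []
    for match in re.finditer(r"\S+", text):
        label = "O"
        for start, end in spans:
            # A token belongs to a span if it lies entirely inside it.
            if match.start() >= start and match.end() <= end:
                label = "B-ABBR" if match.start() == start else "I-ABBR"
                break
        rows.append((match.group(), label))
    return rows


def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical clinical snippet with three annotated short forms.
rows = to_bio("Pt has hx of CHF", [(0, 2), (7, 9), (13, 16)])
for token, label in rows:
    print(f"{token}\t{label}")

# Cross-validation figures reported in the abstract.
print(round(f_score(0.911, 0.887), 3))
```

With the precision and recall reported for span detection, `f_score` reproduces the stated 0.899 after rounding to three decimals.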
Similar resources
Task 2: ShARe/CLEF eHealth Evaluation Lab 2013
In this pilot study, we aimed to generate a reference standard of clinical acronyms and abbreviations normalized to concepts from a standardized, medical vocabulary for the ShARe/CLEF eHealth 2013 challenge. In this paper, we review prior text normalization shared tasks, reference standard generation approaches, and recent clinical acronym and abbreviation normalization research. We report inte...
Clinical Acronym/Abbreviation Normalization using a Hybrid Approach
A unique characteristic of clinical text is the pervasive use of acronyms and abbreviations, which are often ambiguous. The ShARe/CLEF eHealth Evaluation Lab organized three shared tasks on clinical natural language processing (NLP) and information retrieval (IR) in 2013 and one of them was to normalize acronyms/abbreviations to UMLS concept unique identifiers (CUIs). This paper describes a hyb...
Normalization of Abbreviations/Acronyms: THCIB at CLEF eHealth 2013 Task 2
This paper describes the THCIB systems that were used in the ShARe/CLEF eHealth Lab 2013 Task 2. We built a baseline system using open source software, and improved the performance by adding dictionaries. The dictionaries were built from the training set and web resources using existing technologies. The experimental results show that adding a dictionary of acronyms/abbreviations can improve the performance s...
Normalizing acronyms and abbreviations to aid patient understanding of clinical texts: ShARe/CLEF eHealth Challenge 2013, Task 2
BACKGROUND The ShARe/CLEF eHealth challenge lab aims to stimulate development of natural language processing and information retrieval technologies to aid patients in understanding their clinical reports. In clinical text, acronyms and abbreviations, also referenced as short forms, can be difficult for patients to understand. For one of three shared tasks in 2013 (Task 2), we generated a refere...
Overview of the ShARe/CLEF eHealth Evaluation Lab 2014
This paper reports on the 2nd ShARe/CLEF eHealth evaluation lab, which continues our evaluation resource building activities for the medical domain. In this lab we focus on patients’ information needs as opposed to the more common campaign focus of the specialised information needs of physicians and other healthcare workers. The usage scenario of the lab is to ease patients and next-of-kins’ ease...